perm filename 2[00,BGB] blob
sn#047821 filedate 1973-06-07 generic text, type C, neo UTF8
COMMENT ⊗ VALID 00011 PAGES
C REC PAGE DESCRIPTION
C00001 00001
C00002 00002 ~F8THE ALGORITHM: INTRODUCTION.
C00005 00003 ~F81. THRESHOLDING.
C00009 00004 ~F82. CONTOURING.
C00016 00005 ~F83. NESTING.
C00022 00006 ~F83. NESTING.
C00025 00007 " The image tree is generated one threshold level at a time,
C00030 00008 ~F84. SMOOTHING.
C00036 00009 ~F85. COMPARING.
C00040 00010 ~F85. COMPARING.
C00044 00011 ~F8SUMMARY OF LAMINA INERTIA TENSOR EXPRESSIONS.
C00047 ENDMK
C⊗;
~F8THE ALGORITHM: INTRODUCTION.
CRE consists of five steps: thresholding, contouring,
nesting, smoothing and comparing. Thresholding, contouring and
smoothing perform conversions between two different kinds of images.
Nesting and contouring compute topological relationships within a
given image representation. In summary, the major operations are:
    MAJOR OPERATION.    OPERAND.         RESULT.
    1. THRESHOLDING:    6-BIT-IMAGE,     1-BIT-IMAGES.
    2. CONTOURING:      1-BIT-IMAGES,    VIC-IMAGE.
    3. NESTING:         VIC-IMAGE,       NESTED-VIC-IMAGE.
    4. SMOOTHING:       VIC-IMAGE,       ARC-IMAGE.
    5. COMPARING:       IMAGE & FILM,    FILM.
Although the natural order of operations is sequential from
image thresholding to image comparing, the first four steps are
applied one intensity level at a time, from the darkest cut to the
lightest cut, in order to keep memory size down (only nesting
depends on this sequential cut order); comparing alone is applied to
whole images.
The illustrations on pages 21 and 23 show an initial video
image and its final smoothed contour image; the illustrations
immediately below and on page 24 show the corresponding intermediate
sawtoothed contour images. The illustrated images are each composed
of seven intensity levels, and took 16 seconds and 13 seconds to
compute respectively (on a PDP-10, 2usec memory). The final CRE
data structures contained 680 and 293 nodes respectively, which comes
to 2K and 4.5K words; the initial video image requires
10.2K words.
FIGURE: PUMP SAW TOOTHED CONTOURS.
~I1973,800;F8- 22 -
~F81. THRESHOLDING.
Thresholding, the first and easiest step, consists of two
subroutines, called THRESH and PACXOR. THRESH converts a 6-bit
image into a 1-bit image with respect to a given threshold cut level
between zero for black and sixty-three for light. All pixels equal
to or greater than the cut map into one; all pixels less than
the cut map into zero. The resulting 1-bit image is stored in a bit
array of 216 rows by 288 columns (1728 words) called the PAC
(picture accumulator) which was named in memory of McCormick's
ILLIAC-III. After THRESH, the PAC contains blobs of bits. A blob
is defined as "rook's move" simply connected; that is every bit of a
blob can be reached by horizontal or vertical moves from any other
bit without having to cross a zero bit or having to make a diagonal
(bishop's) move. Blobs may of course have holes; or equivalently,
a blob always has one outer perimeter polygon, and may have one,
several or no inner perimeter polygons. This blob and hole topology
is recoverable from the CRE data structure and is built by the
nesting step.
Next, PACXOR copies the PAC into two slightly larger bit
arrays named HSEG and VSEG. Then the PAC is shifted down one row and
exclusive OR'ed into the HSEG array; and the PAC is shifted right
one column and exclusive OR'ed into the VSEG array to compute the
horizontal and vertical border bits of the PAC blobs. Notice that
this is the very heart of the "edge finder" of CRE; namely, PACXOR
is the mechanism that converts regions into edges.
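The THRESH and PACXOR steps can be sketched as below; this is an illustrative Python reconstruction on a toy 3x3 image, not the original SAIL code, and the border arrays are sized one row or column larger than the PAC so that cracks on the image edge are caught.

```python
# Illustrative reconstruction of THRESH and PACXOR (not the original SAIL);
# the toy 3x3 image and the cut level are assumptions for the example.

def thresh(image, cut):
    """Map pixels equal to or greater than cut to 1, the rest to 0 (the PAC)."""
    return [[1 if p >= cut else 0 for p in row] for row in image]

def pacxor(pac):
    """XOR the PAC against shifted copies of itself to find border bits.

    HSEG holds the horizontal cracks (PAC differs from the row above);
    VSEG holds the vertical cracks (PAC differs from the column to the left).
    Both arrays are slightly larger than the PAC, as in the text.
    """
    rows, cols = len(pac), len(pac[0])
    hseg = [[0] * cols for _ in range(rows + 1)]
    vseg = [[0] * (cols + 1) for _ in range(rows)]
    for r in range(rows + 1):
        for c in range(cols):
            above = pac[r - 1][c] if r > 0 else 0
            below = pac[r][c] if r < rows else 0
            hseg[r][c] = above ^ below
    for r in range(rows):
        for c in range(cols + 1):
            left = pac[r][c - 1] if c > 0 else 0
            right = pac[r][c] if c < cols else 0
            vseg[r][c] = left ^ right
    return hseg, vseg

image = [[0, 5, 5],
         [0, 5, 0],
         [0, 0, 0]]
pac = thresh(image, 4)      # one L-shaped blob of three bits
hseg, vseg = pacxor(pac)    # its eight border cracks, four in each array
```

The bits left in HSEG and VSEG together trace the blob's perimeter, which is what MKPGON walks in the contouring step.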
~I1973,800;F8- 24 -
~F82. CONTOURING.
Contouring converts the bit arrays HSEG and VSEG into
vectors and polygons. The contouring itself is done by a single
subroutine named MKPGON, make polygon. When MKPGON is called, it
looks for the uppermost left non-zero bit in the VSEG array. If the
VSEG array is empty, MKPGON returns NIL. However, when the bit is
found, MKPGON traces and erases the polygonal outline to which that
bit belongs and returns a polygon node with a ring of vectors.
To belabor the details (for the sake of later complexities):
the MKPGON trace can go in four directions: north and south along
vertical columns of bits in the VSEG array, or east and west along
horizontal rows of the HSEG array. The trace starts by heading south
until it hits a turn; when heading south MKPGON must check for first
a turn to the east (indicated by a bit in HSEG); next for no turn
(continue south); and last for a turn to the west. When a turn is
encountered MKPGON creates a vector node representing the run of
bits between the previous turn and the present turn. The trace
always ends heading westbound. The outline so traced can be either
the edge of a blob or of a hole; the two cases are distinguished by
looking at the VIC-polygon's uppermost left pixel in the PAC bit
array.
There are two complexities: contrast accumulation and
dekinking. The contrast of a vector is defined as (QUOTIENT
(DIFFERENCE (Sum of pixel values on one side of the vector)(Sum of
pixel values on the other side of the vector)) (length of the vector
in pixels)). Since vectors are always either horizontal or vertical
and are construed as being on the cracks between pixels; the
specified summations refer to the pixels immediately to either side
of the vector. Notice that this definition of contrast will always
give a positive contrast for the vectors of a blob and a negative
contrast for the vectors of a hole.
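In more conventional notation, the contrast of one vertical vector might be computed as in this sketch; the crack and indexing convention here is an assumption for illustration.

```python
# Hypothetical sketch of vector contrast: a vertical vector lies on the
# crack between columns c-1 and c and spans rows r0..r1-1 (convention assumed).

def vector_contrast(image, r0, r1, c):
    right = sum(image[r][c]     for r in range(r0, r1))  # pixels on one side
    left  = sum(image[r][c - 1] for r in range(r0, r1))  # pixels on the other
    return (right - left) / (r1 - r0)  # intensity difference per pixel of length

image = [[0, 10],
         [0, 10]]
vector_contrast(image, 0, 2, 1)   # 10.0: a bright blob east of the crack
```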
The terms "jaggies", "kinks" and "sawtooth" all are used to
express what seems to be wrong about the lowermost left polygon on
page 25. The problem involves doing something to a rectilinear
quantized set of segments, to make its linear nature more evident.
The CRE jaggies solution (in the subroutine MKPGON) merely positions
the turning locus diagonally off its grid point a little in the
direction (northeast, northwest, southwest or southeast) that
bisects the turn's right angle. The distance of this dekink vernier
positioning is always less than half a pixel, but greater for
brighter cuts and less for darker cuts, in order to preserve the
nesting of contours. The sawtoothed and the dekinked versions of a
polygon have the same number of vectors. I am very fond of this
dekinking algorithm because of its incredible efficiency; given that
you have a north, south, east, west polygon trace routine (which
handles image coordinates packed row, column into one accumulator
word); then dekinking requires only one more ADD instruction
execution per vector! ~I1973,800;F8- 26 -
~F83. NESTING.
The nesting problem is to decide whether one contour polygon
is within another. Although easy in the two-polygon case, solving
the nesting of many polygons with respect to each other becomes
n-squared expensive in either compute time or in memory space. The
nesting solution in CRE sacrifices memory for the sake of greater
speed and requires a 31K array, called the SKY.
CRE's accumulation of a properly nested tree of polygons
depends on the order of threshold cutting going from dark to light.
For each polygon there are two nesting steps: first, the polygon is
placed in the tree of nested polygons by the subroutine INTREE;
second, the polygon is placed in the SKY array by the subroutine
named INSKY.
The SKY array is 216 rows of 289 columns of 18-bit pointers.
The name "SKY" came about because the array used to represent the
furthest away regions, or background, which in the case of a robot
vehicle is the real sky blue. The sky contains vector pointers, and
would be more efficient on a virtual memory machine that didn't
allocate unused pages of its address space. Whereas most computers
have more memory containers than address space, computer graphics
and vision might be easier to program in a memory with more address
space than physical space; i.e. an almost empty virtual memory.
The first part of the INTREE routine finds the surrounder of
a given polygon by scanning the SKY due east from the uppermost left
pixel of the given polygon. The SON of a polygon is always its
uppermost left vector. After INTREE, the INSKY routine places
pointers to the vertical vectors of the given polygon into the sky
array.
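The surrounder scan might look like the sketch below; representing the SKY with polygon labels instead of 18-bit vector pointers is a simplification for illustration.

```python
# Simplified sketch of the INTREE surrounder scan; the real SKY holds
# 18-bit vector pointers, but polygon labels suffice to show the idea.

def find_surrounder(sky, row, col):
    """Scan due east along a SKY row from a polygon's uppermost left pixel;
    the first non-empty entry names the surrounding polygon."""
    for c in range(col, len(sky[row])):
        if sky[row][c] is not None:
            return sky[row][c]
    return 'BACKGROUND'   # ran off the image: only the sky surrounds it

sky = [[None, None, 'OUTER', None]]
find_surrounder(sky, 0, 0)   # 'OUTER'
```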
The second part of the INTREE routine checks for and fixes
up the case where the new polygon captures a polygon that is already
enclaved. This only happens when two or more levels of the image
have blobs that have holes. The next paragraph explains the arcane
details of fixing up the tree links of multi-level hole polygons, and
the box following that is a quotation from the appendix of Krakauer's
thesis [3] describing his nesting algorithm.
Let the given polygon be named Poly; let the surrounder
of Poly be called Exopoly; and assume that Exopoly surrounds several
enclaved polygons called "endos", which are already in the nested
polygon tree. Also, there are two kinds of temporary lists, named the
PLIST and the NLIST. There is one PLIST, which is initially a list of
all the endo polygons on Exopoly's ENDO ring. Each endo in turn has
an NLIST, which is initially empty. The subroutine INTREE re-scans
the sky array for the polygon due east of the uppermost left vector
of each endo polygon on the PLIST (Exopoly's ENDO ring). On such
re-scanning (on behalf of, say, an Endo1), there are four cases:
1. No change; the scan returns Exopoly,
which is Endo1's original EXO.
2. Poly captures Endo1; the scan returns Poly, indicating
that Endo1 has been captured by Poly.
3. My brother's fate; the scan hits an Endo2 which
is not on the PLIST; which means that Endo2's EXO is valid
and is also the valid EXO of Endo1.
4. My fate delayed; the scan hits an Endo2 which is still
on the PLIST; which means that Endo2's EXO is not yet
valid, but when discovered it will also be Endo1's EXO;
so Endo1 is CONS'ed onto Endo2's NLIST.
When an endo polygon's EXO has been re-discovered, then all the
polygons on that endo's NLIST are also placed into the polygon tree
at that place. All of this link crunching machinery takes half a page
of code and is not frequently executed.
" The image tree is generated one threshold level at a time,
starting at the highest level (branch tips). At each level, the
image is scanned, and the points above the threshold are marked in a
scratch array. This scratch array is then scanned for marked points.
When one is found, a contiguity routine is called, which visits all
marked points which can be reached from the start via a connected
path. The marks are erased by this routine as it goes, and
statistics are kept on the region thus generated, such as the sums
of the x and y coordinates of the points, and the sum of the
squares of the x and y coordinates (used to compute the center and
the eccentricity). A tree node is then made up for the region, and
the scan for marked points continues. A special mark is left in the
scratch array for each region. When this mark is encountered during
the scan at the next level, it is looked up on an association list.
This establishes the link between a region and the regions which are
a subset of it at the previous level - i.e. between a node and its
sub-nodes. "
" The contiguity scan is the most complex program. It works by
leaving directional pointers in the scratch array. These are
three-bit codes denoting one of the eight possible neighboring
points. The contiguity scan is always started at a point which is on
the bottom edge of the region. It traces along this edge to the
right by moving from one marked point to the next, but always
keeping an un-marked point to the right side. As it goes, it erases
the marks, so that for a region with smooth boundaries, it will
follow a spiral path to the center, "eating up" the marks as it
goes, like a lathe with the tool continually advancing into the
work. "
" As the contiguity routine scans, it lays down back pointers
in the scratch array which enable it to retrace its path back to the
start. If a dead end is reached (no more marked neighbors), it
traces back along this path, looking for marked points to the right.
There can be no marked points on the left side while backtracking,
since this was the right side on the way out, and the outgoing scan
stayed as far to the right as possible. If a marked point is found
on the backtrace, it is replaced with a pointer to the adjacent path
already traced out, and then a new path is traced as if this were a
new starting point. When the backtrace reaches the original starting
point, the contiguity scan is completed. The effect of this
algorithm is to construct a tree of pointers in the scratch array,
with the starting point at the root. All points which can be reached
via a connected path from the starting point will be a part of this
tree. "
~F84. SMOOTHING.
In CRE the term "smoothing" refers more to the problem of
breaking a manifold (polygon) into functions (arcs), rather than to
the problem of fitting functions to measured data. The smoothing
step, converts the polygons of vertical and horizontal vectors into
polygons of arcs. For the present the term "arc" means "linear arc"
which is a line segment. Fancier arcs (circular and cubic spline)
were implemented and thrown out, mostly because they were of no use
to higher processes such as the polygon compare, which would break
the fancier arcs back down into linear vectors for computing areas,
inertia tensors or mere display buffers.
Smoothing is applied to each polygon of a level. To start
the smoothing, a ring of two arcs is formed (a bi-gon) with one arc
at the uppermost left and the other at the lowermost right of the
given vector polygon. Next a recursive make arc operation, MKARC,
is applied to the two initial arcs. Since the arc given to MKARC is
in one-to-one correspondence with a doubly linked list of vectors,
MKARC checks to see whether each point on the list of vectors is
close enough to the approximating arc. MKARC returns the given arc
as good enough when all the sub vectors fall within a given width;
otherwise MKARC splits the arc in two and places a new arc vertex on
the vector vertex that was furthest away from the original arc.
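The MKARC recursion is essentially the split-and-test scheme later known as Ramer-Douglas-Peucker; the sketch below works over an open chain of vertices, with names and the distance test assumed rather than taken from the original code.

```python
# Sketch of the recursive MKARC splitting; names and details are assumed.

def distance_to_chord(p, a, b):
    """Perpendicular distance from vertex p to the line through a and b."""
    (px, py), (ax, ay), (bx, by) = p, a, b
    dx, dy = bx - ax, by - ay
    if dx == 0 and dy == 0:
        return ((px - ax) ** 2 + (py - ay) ** 2) ** 0.5
    return abs(dy * (px - ax) - dx * (py - ay)) / (dx * dx + dy * dy) ** 0.5

def mkarc(points, width):
    """Return the indices of the retained vertices for one open arc."""
    worst, dist = None, 0.0
    for i in range(1, len(points) - 1):
        d = distance_to_chord(points[i], points[0], points[-1])
        if d > dist:
            worst, dist = i, d
    if worst is None or dist <= width:        # every vertex close enough:
        return [0, len(points) - 1]           # the arc is good enough
    left = mkarc(points[:worst + 1], width)   # otherwise split at the vertex
    right = mkarc(points[worst:], width)      # farthest from the chord
    return left + [i + worst for i in right[1:]]
```

With a tolerance of 0.5 the chain [(0,0), (1,1), (2,0)] keeps all three vertices; with a tolerance of 2.0 it collapses to its two endpoints, showing one recursion level's difference as in the two large images of figure-7.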
The two large images in figure-7, illustrate a polygon
smoothed with arc width tolerances set at two different widths in
order to show one recursion of MKARC. The eight smaller images
illustrate the results of setting the arc width tolerance over a
range of values. Because of the dekinking mentioned earlier, the arc
width tolerance can be equal to or less than 1.0 pixels and still
yield a substantial reduction in the number of vectors it takes to
describe a contour polygon.
A final important smoothing detail is that the arc width
tolerance is actually taken as a function of the highest contrast
vector found along the arc; so that high contrast arcs are smoothed
with much smaller arc width tolerances than are low contrast arcs.
After smoothing, the contrast across each arc is computed and the
ring of arcs replaces the ring of vectors of the given polygon.
(Polygons that would be expressed as only two arcs are deleted).
~I1973,800;F8- 30 -
~F85. COMPARING.
The compare step of CRE, CMPARE, connects the polygons and
arcs of the current image with corresponding polygons and arcs of
the previous image. CMPARE solves the problem of correlating
features between two similar images and is composed of four
subsections:
1. make shape nodes for polygons.
2. compare and connect polygons one to one.
3. compare and connect polygons two to one.
4. compare and connect vertices of connected polygons.
First, the shape nodes of all the polygons of an image are
computed. The shape node contains the center of mass and the lamina
inertia tensor of a polygon. The lamina inertia tensor of a polygon
with N sides is computed by summation over N trapezoids. The
trapezoid corresponding to each side is formed by dropping
perpendiculars "up" to the top of the image frame; each such
trapezoid consists of a rectangle and a right triangle; since the
sides of polygons are directed vectors the areas of the triangles
and rectangles can be arranged to take positive and negative values
such that a summation will describe the interior region of the
polygon as positive. The equations necessary for computing the
lamina inertia tensor of a polygon are collected in a table in the
postscripts to this paper and were derived by using Goldstein's
Classical Mechanics [1] as a reference. The meaning of the inertia
tensor is that it characterizes each polygon by a rectangle of a
certain length and width at a particular location and orientation; and,
of further importance, such inertia tensors can be "added" to
characterize two or more polygons by a single rectangle. It is the
lamina inertia tensor rectangles that are actually compared by CRE.
Second, all the shapes of the polygons of one level of the
first image are compared with all the shapes of the polygons of the
corresponding level of the second image for nearly exact match. The
potential (M*N/2) compares are avoided by sorting on the center of
mass locations. In CRE, which is intended for comparing sequences
of pictures of natural scenes; match for center of mass location is
tested first and most strictly, followed by match for inertia.
Pointers between matching polygons are placed in the time link
positions of the polygon nodes and the polygons are considered to be
mated in time. ~I1973,800;F8- 32 -
Third, all the unmated polygons of a level are considered
two at a time, and a fusion shape node for each pair is made. The
potential (N*N/2 - N) fusion shapes are avoided because there is a
maximum possible unmated inertia in the other image; in particular,
if there are no unmated polygons in one image, then the extra
polygons of the other image can be ignored. In the event that there are unmated
polygons in corresponding levels of the two images, the fusion
shapes of one are compared with the polygon shapes of the other.
The fusion (fission) compare solves the rather nasty problem,
illustrated in figures 9A and 9B of linking two contour polygons of
one image with a single contour polygon in the next image.
Fourth, the vertices of polygons mated in time are compared
and mated. To start a vertex compare, the vertices of one polygon
are translated, rotated and dilated to get that polygon's lamina
inertia tensor coincident with its mate (or mates). Conceptually,
each vertex of one polygon is compared with each vertex of the other
polygon(s) and the mutually closest vertices (closer than an
epsilon) are considered to be mated. Actually, the potential (N*M)
compares are avoided by a window splitting scheme similar to that
used in hidden line elimination algorithms (like Warnock's).
The results of vertex compare and mate are illustrated in
figures 9A and 9D; the compare execution takes less than a second on
images such as the pump, blocks, and dolls that have appeared in
this paper. The applications of this compare might include the
aiming of a pixel correlation comparator (such as Quam's);
recognition and location of an expected object; or the location and
extent of an unknown object. It is this latter application that
will be described in my forthcoming thesis.
~I1973,800;F8- 33 -
~F8SUMMARY OF LAMINA INERTIA TENSOR EXPRESSIONS.
RECTANGLE'S LAMINA INERTIA TENSOR ABOUT ITS CENTER OF MASS.
MXX ← B*B*AREA/12; (B HEIGHT IN ROWS).
MYY ← A*A*AREA/12; (A WIDTH IN COLUMNS).
MZZ ← MXX + MYY;
PXY ← 0;
ORIENTED RIGHT TRIANGLE'S LAMINA INERTIA TENSOR ABOUT ITS CENTER OF MASS.
MXX ← B*B*AREA/18; (B HEIGHT IN ROWS).
MYY ← A*A*AREA/18; (A WIDTH IN COLUMNS).
MZZ ← MXX + MYY;
PXY ← -A*B*AREA/36;
SUMMATION OF LAMINA INERTIA TENSORS.
AREA ← (AREA1 + AREA2);
XCM ← (AREA1 * XCM1 + AREA2 * XCM2) / AREA;
YCM ← (AREA1 * YCM1 + AREA2 * YCM2) / AREA;
MXX ← MXX1 + YCM1*YCM1*AREA1 +
MXX2 + YCM2*YCM2*AREA2 - YCM*YCM*AREA;
MYY ← MYY1 + XCM1*XCM1*AREA1 +
MYY2 + XCM2*XCM2*AREA2 - XCM*XCM*AREA;
PXY ← PXY1 - XCM1*YCM1*AREA1 +
PXY2 - XCM2*YCM2*AREA2 + XCM*YCM*AREA;
ANGLE OF PRINCIPAL AXIS.
PHI ← 0.5*ATAN((MYY-MXX)/(2*PXY))
PXY ← 0.5*(MYY - MXX)*TAN(2*PHI)
TRANSLATION OF LAMINA INERTIA TENSOR AWAY FROM CENTER OF MASS.
MXX' ← MXX + AREA*DY*DY;
MYY' ← MYY + AREA*DX*DX;
PXY' ← PXY - AREA*DX*DY;
ROTATION OF LAMINA INERTIA TENSOR ABOUT CENTER OF MASS.
C ← COSINE(PHI);
S ← SINE(PHI);
MXX' ← C*C*MXX + S*S*MYY - 2*C*S*PXY;
MYY' ← C*C*MYY + S*S*MXX + 2*C*S*PXY;
PXY' ← (C*C - S*S)*PXY + C*S*(MYY - MXX);
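As a sanity check, the rectangle and summation entries above can be transcribed directly into code; the shape tuple layout here is an assumption for the sketch.

```python
# The RECTANGLE and SUMMATION expressions above, transcribed for checking;
# a shape is the tuple (AREA, XCM, YCM, MXX, MYY, PXY), layout assumed.

def rectangle(a, b, xcm=0.0, ycm=0.0):
    """Lamina inertia tensor of an A-wide, B-high rectangle about its center."""
    area = a * b
    return (area, xcm, ycm, b * b * area / 12, a * a * area / 12, 0.0)

def add_shapes(s1, s2):
    """Summation of two lamina inertia tensors, about the combined center."""
    area1, xcm1, ycm1, mxx1, myy1, pxy1 = s1
    area2, xcm2, ycm2, mxx2, myy2, pxy2 = s2
    area = area1 + area2
    xcm = (area1 * xcm1 + area2 * xcm2) / area
    ycm = (area1 * ycm1 + area2 * ycm2) / area
    mxx = (mxx1 + ycm1 * ycm1 * area1 +
           mxx2 + ycm2 * ycm2 * area2 - ycm * ycm * area)
    myy = (myy1 + xcm1 * xcm1 * area1 +
           myy2 + xcm2 * xcm2 * area2 - xcm * xcm * area)
    pxy = (pxy1 - xcm1 * ycm1 * area1 +
           pxy2 - xcm2 * ycm2 * area2 + xcm * ycm * area)
    return (area, xcm, ycm, mxx, myy, pxy)
```

Adding two unit squares centered at (0, 0) and (1, 0) reproduces the tensor of a 2-wide, 1-high rectangle centered at (0.5, 0), which is exactly what the summation rule requires.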
~I1973,800;F8- 51 -